Logical to physical data storage mapping

ABSTRACT

A method operable with a storage system comprises processing an Input/Output (I/O) request to a storage device, extracting a logical mapping unit from the I/O request, determining that the I/O request is for variable length data, and accessing a map that links the logical mapping unit to one or more physical addresses of the storage device. The method also comprises calculating a number of physical mapping units at the physical addresses to service the I/O request.

SUMMARY

Systems and methods presented herein provide for logical unit mapping to physical mapping units of a data storage device. In one embodiment, a method includes processing an Input/Output (I/O) request to a storage device and extracting a logical mapping unit from the I/O request. The method also includes determining that the I/O request is for variable length data, and accessing a map that links the logical mapping unit to one or more physical addresses of the storage device. The method also includes calculating a number of physical mapping units at the physical addresses to service the I/O request.

The various embodiments disclosed herein may be implemented in a variety of ways as a matter of design choice. For example, the system and method embodiments hereof may take the form of computer hardware, software, firmware, or combinations thereof. Other exemplary embodiments are described below.

BRIEF DESCRIPTION OF THE FIGURES

Some embodiments are now described, by way of example only, and with reference to the accompanying drawings. The same reference number represents the same element or the same type of element on all drawings.

FIG. 1 is a block diagram of an exemplary storage system.

FIG. 2 illustrates an exemplary mapping of logical mapping units to physical mapping units.

FIGS. 3 and 4 are exemplary mapping structures of the storage system.

FIG. 5 is a flowchart of an exemplary process of the storage system.

FIG. 6 is a block diagram of exemplary garbage collection units for variable length data of the storage system.

FIG. 7 is a flowchart of another exemplary process of the storage system.

FIG. 8 is a block diagram of an exemplary garbage collection unit for fixed length data of the storage system.

FIG. 9 is a block diagram of another exemplary garbage collection unit for fixed length data of the storage system.

FIGS. 10 and 11 illustrate exemplary orders of logical mapping units of the storage system.

FIGS. 12 and 13 illustrate exemplary data deduplication mapping structures and tables of the storage system.

FIG. 14 is a block diagram of an exemplary storage controller and its associated storage device.

FIG. 15 is a block diagram of an I/O module comprising storage devices and their associated controllers interfacing with a host system.

DETAILED DESCRIPTION OF THE FIGURES

The figures and the following description illustrate specific exemplary embodiments. It will thus be appreciated that those skilled in the art will be able to devise various arrangements that, although not explicitly described or shown herein, embody certain principles that are included within the scope of the embodiments. Furthermore, any examples described herein are intended to aid in understanding the principles of the embodiments and are to be construed as being without limitation to such specifically recited examples and conditions. As a result, the embodiments are not limited to any specific examples described below.

FIG. 1 is a block diagram of an exemplary storage system 10 that is operable to store and read data 12 resulting from I/O requests by a host system 11. The storage system 10 comprises a controller 20 that controls how and where the data 12 is persistently stored on a storage device 30. The controller 20 employs a map 21 that maps data 12 to and from the storage device 30. For example, the host system 11 maintains an address list of logical mapping units (LMUs) where data is stored and where data is to be written. These LMUs, however, are not the actual locations for the data 12 in the storage device 30. Rather, logical mapping unit addresses (LMUAs) 26 of the address list are used to link to physical mapping unit addresses (PMUAs) 25 in the map 21, as the controller 20 maintains control over where data 12 is stored and where data 12 is to be written in the storage device 30. This is because the controller 20 may need to perform certain operations on the data 12 from time to time, such as garbage collection and data deduplication. So, the controller 20 maintains the map 21 that links the LMUAs 26 of the data 12 to their corresponding PMUAs 25 of the storage device 30, thereby abstracting the physical mapping units (PMUs) of the storage device 30 from the host system 11.

Thus, when the host system 11 needs to read data from the storage device 30, the host system 11 configures a read I/O request that includes an LMUA 26. The host system 11 transfers the read I/O request to the controller 20, where the I/O request goes through a read path 23 to extract the LMUA 26 from the I/O request. The controller 20 processes the LMUA 26 via the map 21 to identify the PMUA 25 of the data 12. In other words, the controller 20 translates the LMUA 26 from the host system 11 to the PMUA 25 using the map 21 to identify which PMU 32 of the storage device 30 stores the data. The controller 20 then uses the PMUA 25 to retrieve the data 12 from the PMU 32 of the storage device 30 and return it to the host system 11.

Similarly, when the host system 11 needs to write data to the storage device 30, the host system 11 configures a write I/O request that includes an LMUA 26 and the data 12 to be written to the storage device 30. The host system 11 transfers the write I/O request to the controller 20, where the I/O request goes through a write path 24 to extract the LMUA 26 and the data 12 from the I/O request. The controller 20 processes the LMUA 26 and writes the data 12 to a PMU 32 of the storage device 30 at a certain PMUA 25. Once the data is stored, the storage device 30 reports the PMU 32 to the controller 20, which in turn updates the map 21 to link the LMUA 26 to the appropriate PMUA 25 of the PMU 32.

These processes may be used in a variety of storage device types, such as hard disk drives (HDDs) and Solid State Drives (SSDs). SSDs (e.g., NAND flash devices, including multi-dimensional NAND flash devices such as 3D NAND flash devices, etc.) do not use the moving mechanical components that an HDD does. Instead, these storage devices use integrated circuitry as memory cells to persistently store data. The memory cells are arranged in “pages”, which are arranged in “blocks”. And, the blocks are arranged on a “plane” of a die. The controller 20 in such a case writes data to the pages of the blocks and manages how and where that data is stored and changed via subsequent I/O requests.

In some embodiments, the storage device 30 may be configured using one or more SSD architectures, such as Single Level Cell (SLC) architectures and Multi-Level Cell (MLC) architectures. An SLC architecture allows a memory cell to store one bit of data. Traditionally, an MLC architecture meant that a memory cell could store two bits of data. But architectures have evolved and now provide even higher levels of density, such as Triple Level Cell (TLC) architectures that store three bits per memory cell, and Quad Level Cell (QLC) architectures that store four bits per memory cell. Generally, though, any architecture storing more than one bit of data per cell may also be referred to as an MLC architecture.

Other examples of the storage device 30 include magnetic recording devices, such as shingled magnetic recording (SMR) media, where tracks are shingled upon one another to increase data storage capacity. For example, a write head of an SMR drive overlaps tracks of the SMR drive. Thus, when writing to one track of the SMR drive, the write head may disturb the data of an adjacent track. So, the SMR drive marks the disturbed data as invalid and performs a sort of garbage collection on those tracks, similar to an SSD. Still other examples of the storage device 30 include phase-change memory, resistive Random Access Memory (RAM), magnetoresistive storage devices (e.g., magnetoresistive RAM, or “MRAM”, Spin-Transfer Torque MRAM, or “STT-MRAM”, etc.), and various combinations of the examples herein.

It should be noted that, while the I/O requests can and often do come directly from a host system, the I/O requests may be cached in another device, such as a buffer, before being executed by the controller 20, or may even be issued by other storage devices themselves. Accordingly, the embodiments are not intended to be limited to any particular type of I/O request. Based on the foregoing, the controller 20 is any device, system, software, firmware, or combination thereof operable to service I/O requests to read data from and write data to the storage device 30 and to maintain the integrity of the data in the storage device 30.

FIG. 2 illustrates an exemplary mapping of LMUAs 26 to PMUAs 25. In this embodiment, the data 12 is illustrated as being variable length, although the controller 20 is operable to store both variable length data and fixed length data. For example, some data is compressed to save space within the storage device 30, causing it to be variable in length. Other data 12 has a fixed length, such as 1 kB, 2 kB, 4 kB, 8 kB, or 16 kB, which may correspond to page sizes of an SSD.

Fixed length data is generally easier for the controller 20 to handle because the storage device 30 may be configured to store data in such a manner. For example, in an SSD, each package contains one or more dies (e.g., one, two, or four). The die is generally the smallest unit that can independently execute commands or report status. Each die contains one or more planes (e.g., one or two). Identical and concurrent operations can take place on each plane, although with some restrictions. Each plane contains a number of blocks, which are the smallest units that can be erased. And, each block contains a number of pages, which are the smallest units that can be programmed (i.e., written to). Generally, fixed length data corresponds to the page sizes of SSDs (e.g., 1 kB, 2 kB, 4 kB, 8 kB, and 16 kB). So, the controller 20 can tell exactly where data starts and ends from the logical to physical mapping units.

To illustrate, a page of an SSD may have 8 kB of physical storage located at a PMUA 25. That PMUA 25 is linked to an LMUA 26. When fixed length data of 8 kB is stored at the PMUA 25, the data 12 (including any headers, padding, etc.) starts and ends with the PMU 32 that is the physical page of the SSD. So, to access the data 12 at that page, the controller 20 converts the LMUA 26 to the PMUA 25 using the map 21 to retrieve the data 12.

Variable length data, however, presents other challenges because the data 12 may span more than one page and/or even fractions of a page. To illustrate, the data 12-1 through 12-3 are stored at the PMU 32-1 (also labeled PMUA1/PMU1) with their headers 31-1 through 31-3. The headers 31-1 through 31-3 tell the controller 20 the lengths and offsets of each piece of data 12-1 through 12-3 at the PMU 32-1 of the storage device 30. The data 12-3, however, is split into two parts, with the remaining portion written at the PMU 32-2 (also labeled PMUA2/PMU2). A part of the data 12-4 is also written to the PMU 32-2 along with its header 31-4, with its remainder being written elsewhere (not shown for simplicity).

The map 21 links the LMUAs 26 to the PMUAs 25 and their corresponding PMUs 32 of the storage device 30. The map 21 also includes the number of PMUs (NPMU) 27 that each piece of data 12 occupies. The NPMU, along with the headers 31, allows the controller 20 to know where each piece of data 12 begins and ends. In this example, the LMUA 26-1 pointing to the PMUA 25-1 has an NPMU 27 of “1”, meaning that all of the data for the LMUA 26-1 can be found at the PMU 32-1. The same goes for the data 12-2 associated with the LMUA 26-2. The data 12-3 associated with the LMUA 26-3, however, has an NPMU of “2”, meaning that the data 12-3 spreads over the PMU 32-1 and the PMU 32-2. The data 12-3's header 31-3 shows where the data 12-3 begins (i.e., via the offset) and ends (i.e., via the length). Similarly, the data 12-4 associated with the LMUA 26-4 has an NPMU of “3”, meaning that the data 12-4 spreads across three PMUs 32, with the first part being located at the PMU 32-2 (again, the other PMUs 32 not being shown for simplicity).
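
For illustration only, the following is a minimal sketch, in C, of how a header 31 carrying the offset and length of a piece of data 12 within a PMU 32 might be laid out. The type name, field names, and field widths are assumptions made for this example, not the actual on-media format of the embodiments.

#include <stdint.h>

/* Hypothetical layout of a header 31 stored alongside its data 12 in a
 * PMU 32. Field names and widths are illustrative assumptions only. */
struct pmu_header {
    uint32_t lmua;    /* LMUA 26 that owns this piece of data 12           */
    uint16_t offset;  /* byte offset of the data 12 within the PMU 32      */
    uint16_t length;  /* length of the (possibly compressed) data 12       */
};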

When the controller 20 needs to retrieve the data 12, the controller 20 processes the LMUA 26, locates the associated header 31, and reads out the data 12 associated with that header 31. To illustrate, when the controller 20 receives a host read request for the LMUA 26-3, the controller 20 locates the PMUA 25-1, the address of the PMU 32-1 where the data 12-3 is located in the storage device 30. In this instance, the data 12-3 associated with the LMUA 26-3 begins at the PMU 32-1 and has an NPMU of “2”. The controller 20 then understands that the data 12-3 associated with the LMUA 26-3 spreads across the PMU 32-1 and the PMU 32-2. The header 31-3 associated with the data 12-3 is located in the same PMU 32-1 as the first part of the data 12-3. The controller 20 locates the header 31-3, determines the length of the data 12-3 and its offset in the PMU 32-1 of the storage device 30, and reads out the first part of the data 12-3 in the PMU 32-1. The controller 20, knowing the length of the data 12-3 from the header 31-3, then reads out the remaining portion of the data 12-3 from the PMU 32-2 of the storage device 30 at the address PMUA 25-2.

Generally, the map 21 is stored in Random Access Memory (RAM), such as double data rate (DDR) RAM. The NPMU field 27 in the map 21, when combined with an LDATA field that shows the length of the data 12 (e.g., which may be scaled by some constant) being stored at a PMU 32, increases the size of the map 21 and reduces the amount of mapping that can be cached in the RAM. This, in turn, slows down host I/Os because the entirety of the map 21 cannot be accessed from the RAM. Instead, portions of the map 21 are stored within the slower access storage device 30. In this regard, a new mapping structure for the map 21 is introduced that removes the NPMU and/or the LDATA fields from the map 21 such that more of the map 21 may be stored in RAM.

FIGS. 3 and 4 are mapping structures for host I/O requests to the storage device 30. In these embodiments, certain fields are removed from the mapping structure of the map 21 so as to reduce the map size and improve performance of the storage system 10. For example, FIG. 3 illustrates a mapping structure 40 in which the LMUA 26 is linked to the PMUA 25. In the mapping structure 40, the NPMU field 27 and the “other” field 28 are configured in the map 21 without the LDATA field. FIG. 4 illustrates the mapping structure 41 with the LDATA field 29 and the other field 28. Thus, the mapping structure 40 is devoid of the LDATA field 29 and the mapping structure 41 is devoid of the NPMU field 27. By removing these fields from the mapping structures 40 and 41, the size of the map 21 is reduced. However, this information is still needed by the controller 20 to determine the number of PMUs 32 that the data 12 is spread across and/or how long the data 12 is. The embodiments below exemplify how this information is still obtained.
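
For illustration only, a minimal C sketch contrasting the two mapping structures follows. The type names and field widths are assumptions made for this example; the “other” field 28 is shown simply as opaque bits.

#include <stdint.h>

/* Mapping structure 40 (FIG. 3): the NPMU field 27 is kept and the LDATA
 * field is omitted. The entry is assumed to be indexed by the LMUA 26. */
struct map_entry_40 {
    uint32_t pmua;   /* PMUA 25 linked to the LMUA 26                      */
    uint16_t npmu;   /* NPMU 27: number of PMUs 32 the data 12 occupies    */
    uint16_t other;  /* "other" field 28: miscellaneous information        */
};

/* Mapping structure 41 (FIG. 4): the LDATA field 29 is kept and the NPMU
 * field is omitted; the NPMU is instead calculated on the fly. */
struct map_entry_41 {
    uint32_t pmua;   /* PMUA 25 linked to the LMUA 26                      */
    uint16_t ldata;  /* LDATA 29: length of the data 12, possibly scaled   */
    uint16_t other;  /* "other" field 28                                   */
};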

For example, during a write operation, the host system 11 may form a write I/O request comprising an LMUA field 26. The LMUA 26 is linked to a PMUA field 25, an NPMU field 27, and an “other” field 28. The write I/O request directs the controller 20 to locate a PMU 32 to store the data 12. After the data is written, the mapping structure 40 is updated in the map 21 with the NPMU field 27 to inform the controller 20 as to how many PMUs 32 of the storage device 30 the data 12 occupies. The other field 28 is used for miscellaneous information pertaining to the data 12 to be stored.

The mapping structure 41 includes the LMUA 26 linking to the PMUA 25 where the data is located. The mapping structure 41 also includes the LDATA field 29 and the other field 28. Missing from this mapping structure 41 is the NPMU field 27. Again, removing this field from the mapping structure 41 reduces the overall size of the map 21. But the controller 20 still needs to know how many PMUs 32 are being used to store the data 12. In this embodiment, however, the NPMU information can be calculated by the controller 20 on the fly, as will be discussed below.

Now, turning to the flowchart of FIG. 5, a process 50 used by the controller 20 is presented. In this embodiment, the controller 20 extracts an LMUA 26 from an I/O request, in the process element 51. Then, the controller 20 accesses the map 21 that links the LMUA 26 to one or more PMUAs 25 of the storage device 30 (e.g., via the mapping structure 41), in the process element 52. The controller 20 then determines whether the data 12 is variable length data, in the process element 53. If so, the controller 20 calculates the NPMU on the fly to determine how many PMUs 32 the data 12 spans in the storage device 30, in the process element 54. From there, the controller 20 locates the data 12 and its associated header 31 to retrieve the data 12 from the PMUs 32 located at the PMUAs 25, in the process element 55. If the data 12 is fixed length data, the controller 20 simply goes to the PMU 32 linked to the LMUA 26 via the PMUA 25 to retrieve the data from the PMU 32.
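
A hedged C sketch of the process 50 for a read I/O request follows, reusing the struct map_entry_41 sketched above. The SPMU_DATA value, the read_pmus helper, and the way the variable length determination is passed in are assumptions for this example, not the actual firmware interface.

#include <stdint.h>

#define SPMU_DATA 4096u  /* assumed number of data bytes stored in one PMU 32 */

int read_pmus(uint32_t pmua, uint32_t npmu, void *buf);  /* hypothetical helper */

int service_read(const struct map_entry_41 *map, uint32_t lmua,
                 int variable_length, void *buf)
{
    const struct map_entry_41 *e = &map[lmua];  /* elements 51-52: map lookup */
    uint32_t npmu = 1;                          /* fixed length: one PMU 32   */

    if (variable_length) {                      /* element 53                 */
        /* element 54: calculate the NPMU on the fly, rounding up so the
         * whole piece of data 12 is read out of the storage device 30. */
        npmu = (e->ldata + SPMU_DATA - 1) / SPMU_DATA;
    }
    /* element 55: read npmu PMUs starting at e->pmua, locate the header 31,
     * and copy the data 12 into buf. */
    return read_pmus(e->pmua, npmu, buf);
}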

Before delving into the discussion of calculating the NPMU on the fly, a discussion of garbage collection with respect to the mapping structure 40 is presented, as new garbage collection procedures are employed with the new mapping structures 40 and 41. Generally, garbage collection is performed on a unit of the physical storage space, called a garbage collection unit (GCU). Two GCUs 60-1 and 60-2 are exemplarily illustrated in FIG. 6. To assist with garbage collection, each GCU 60 maintains some garbage collection statistics (GCSs), such as the number of valid data bytes. The controller 20 generally picks the GCU 60 with the smallest GCS for recycling. The embodiments herein provide for updating of the GCSs.

To illustrate, the GCS is updated upon each host write. In steady state, two GCUs 60 (e.g., the GCUs 60-1 and 60-2) are involved in GCS updates upon each host write I/O request for an LMUA 26. The GCU 60-2 is where the LMUA 26's new data is written, and the GCU 60-1 is where the LMUA 26's old data resides. With the mapping structure 40, the GCS may take the format of a TLMU-VLMU pair, where TLMU is the total number of LMUAs 26 stored in a GCU 60, and VLMU is the total number of LMUAs 26 with valid data. Naturally, VLMU <= TLMU. Initially, VLMU = TLMU = 0 for an empty GCU 60, and both values increase by “1” upon every LMUA 26 having its data written to the GCU 60-2 by the controller 20. This invalidates the LMUA 26's old data in the GCU 60-1 (as indicated by the crossing) and the VLMU of the GCU 60-1 thereby decreases by “1”.

For example, when the host system 11 needs to write the data 12-21 to the GCU 60-2, VLMU = TLMU = 0, initially. And, the GCU 60-1 has VLMU = TLMU = 16, since there are 16 headers (i.e., the headers 31-1 through 31-16) in the GCU 60-1. Then, the controller 20 writes the data 12-21 to the PMU 32-21 in the GCU 60-2, leading to an increment of VLMU = TLMU = 1 by the controller 20 for the GCU 60-2. Meanwhile, that LMUA 26's old data 12-1 resides in the GCU 60-1. Hence, the controller 20 decreases the VLMU of the GCU 60-1 by 1, from 16 to 15. Note that since the GCU 60-1 is full, the TLMU of the GCU 60-1 is still 16. In the GCU 60-2, VLMU = TLMU, and the controller 20 will keep updating until the GCU 60-2 is full, as shown in the flowchart of FIG. 7.

For example, FIG. 7 illustrates a process 70 in which the controller 20 generates the GCS for recycling of the GCUs 60. As data 12 enters the GCU 60-2, the controller 20 increments the TLMU of the GCU 60-2, in the process element 71. Then, the controller 20 increments the VLMU of the GCU 60-2 while decrementing the VLMU of the GCU 60-1, in the process element 72. From there, the controller 20 determines whether the GCU 60-2 is full, in the process element 73. If so, then the controller 20 moves on to the next GCU 60 (e.g., a GCU 60-3) to write data 12, in the process element 74. Otherwise, the controller 20 continues writing to the GCU 60-2 until it is full.

If an LMUA 26's old data resides in the same GCU 60 as the new data, the controller 20 does not change the VLMU, but increases the TLMU by only 1.
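
The TLMU-VLMU accounting described above can be summarized with a minimal C sketch, assuming a simple per-GCU counter structure; the names are illustrative only.

#include <stddef.h>
#include <stdint.h>

/* Illustrative per-GCU garbage collection statistics (GCS). */
struct gcs {
    uint32_t tlmu;  /* total number of LMUAs 26 stored in the GCU 60      */
    uint32_t vlmu;  /* number of those LMUAs 26 whose data is still valid */
};

/* Update the GCS upon a host write of one LMUA 26: new_gcu is where the new
 * data 12 lands, old_gcu is where the old data resides (NULL if none). */
void gcs_on_host_write(struct gcs *new_gcu, struct gcs *old_gcu)
{
    new_gcu->tlmu++;                  /* one more LMUA 26 stored here          */
    if (old_gcu == new_gcu) {
        return;                       /* old data in same GCU: VLMU unchanged  */
    }
    new_gcu->vlmu++;                  /* the new copy is valid                 */
    if (old_gcu != NULL && old_gcu->vlmu > 0) {
        old_gcu->vlmu--;              /* the old copy is now invalid           */
    }
}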

When recycling starts, the GCS in the format of the TLMU-VLMU pair is used to decide which GCU 60 to pick for recycling. For example, a valid data rate RVALID may be calculated as follows:

RVALID=VLMU/TLMU.

In this regard, the controller 20 picks the GCUs 60 with the lowest values of RVALID for recycling. If two candidate GCUs 60 have the same lowest-valued RVALID but different VLMUs or different TLMUs, the controller 20 may pick one GCU 60 at random or choose the one that produces some desired result. For example, the GCU 60-1 may have a VLMU1 = 2, a TLMU1 = 4, and an RVALID1 = 2/4 = 1/2. The GCU 60-2 may have a VLMU2 = 8, a TLMU2 = 16, and an RVALID2 = 8/16 = 1/2, which is the same as RVALID1. However, recycling the GCU 60-1 will most likely take less time since there are fewer TLMUs for a validation check and fewer VLMUs to move around. When the controller 20 is busy catching up with write commands from the host system 11, or is in urgent need of free space, the controller 20 may choose to recycle the GCU 60-1 first. When the controller 20 is not busy (e.g., when idling, or when there is still a sufficient amount of free space to serve the host system 11), the controller 20 may choose to recycle the GCU 60-2 first. Similarly, even if a GCU 60 does not have the lowest RVALID but has the smallest VLMU or the smallest TLMU, the controller 20 may still choose to recycle this GCU 60 for a shorter recycling time, to create more new free space, and/or for other performance reasons.
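
As one possible realization of this selection policy, the following C sketch picks the recycling candidate with the lowest RVALID = VLMU/TLMU and breaks ties by the smaller TLMU (the shorter recycling time mentioned above). It reuses the struct gcs from the sketch above; the selection loop itself is an assumption, not the actual firmware.

#include <stddef.h>
#include <stdint.h>

/* Pick the index of the GCU 60 with the lowest RVALID = VLMU / TLMU.
 * Assumes count > 0 and TLMU > 0 for every GCU (i.e., no empty GCUs). */
size_t pick_gcu_for_recycling(const struct gcs *gcus, size_t count)
{
    size_t best = 0;
    for (size_t i = 1; i < count; i++) {
        /* Compare VLMU[i]/TLMU[i] against VLMU[best]/TLMU[best] without
         * division by cross-multiplying. */
        uint64_t lhs = (uint64_t)gcus[i].vlmu * gcus[best].tlmu;
        uint64_t rhs = (uint64_t)gcus[best].vlmu * gcus[i].tlmu;
        if (lhs < rhs || (lhs == rhs && gcus[i].tlmu < gcus[best].tlmu)) {
            best = i;  /* lower RVALID, or same RVALID with fewer TLMUs */
        }
    }
    return best;
}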

Further, if the original GCS in the format of the total number of valid data bytes DVALID, or its scaled version such as a DVALID in units of 1 kB, is also available, the controller 20 may consider a new garbage collection policy by using both the DVALID and the RVALID for performance reasons. For example, if two candidate GCUs have the same lowest RVALID but different DVALIDs, the one with the smaller DVALID may be recycled first to reduce overhead from recycling writes.

The idea behind this GCS update by counting is to use a universal compressed data length LCMPD for every LMUA 26 stored in one GCU 60 based on an average compression rate RCMP, if the data 12 has a variable length (e.g., as a result of compression). The RCMP is generally defined as:

RCMP=(TLMU*SLMU)/(TPMU*SPMU), where SLMU is the size of one LMU in bytes, SPMU is the size of one PMU 32 in bytes, TLMU is the total number of LMUAs 26 in one GCU 60, and TPMU is the total number of PMUs 32 in one GCU 60.

Returning to FIG. 6, the TPMU = 8 and the TLMU = 16, given that the GCU 60-1 is composed of 8 PMUs 32 and contains 16 LMU headers 31 (i.e., the headers 31-1 through 31-16). Now, assume that SLMU = SPMU = 4 kB; then RCMP = (16*4 kB)/(8*4 kB) = 2. On average, each LMUA 26 has an LCMPD = SLMU/RCMP. In the GCU 60-1, LCMPD = 4 kB/2 = 2 kB. Hence, with the controller 20 reducing the VLMU by 1 in the GCU 60-1, the valid number of bytes is equivalently reduced by 2 kB.

The definition of RCMP indicates its equivalency with the TLMU. Hence, the RCMP-VLMU pair can be used as another format for computing the GCS, as an alternative to the TLMU-VLMU pair. At any time, the size of valid data LVALID in bytes of each GCU 60 can be calculated on the fly as follows:

LVALID=VLMU*SLMU/RCMP.

The controller 20 can then use the LVALID in a garbage collection policy to decide which GCU 60 to pick for recycling (or for other host system 11 usage), using the LVALID alone, or in combination with the VLMU and/or other real-time performance metrics.
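
Since RCMP = (TLMU*SLMU)/(TPMU*SPMU), the SLMU term cancels when it is substituted into LVALID = VLMU*SLMU/RCMP, giving LVALID = VLMU*TPMU*SPMU/TLMU. A one-function C sketch of this on-the-fly calculation follows; the function name and the use of integer arithmetic are assumptions for this example.

#include <stdint.h>

/* Valid data size of a GCU 60 in bytes: LVALID = VLMU * TPMU * SPMU / TLMU. */
uint64_t lvalid_bytes(uint32_t vlmu, uint32_t tlmu, uint32_t tpmu, uint32_t spmu)
{
    if (tlmu == 0) {
        return 0;  /* empty GCU 60: no valid data */
    }
    return ((uint64_t)vlmu * tpmu * spmu) / tlmu;
}

/* Example from FIG. 6: vlmu = 15, tlmu = 16, tpmu = 8, spmu = 4096 gives
 * 15 * 2 kB = 30720 bytes of valid data. */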

In yet another embodiment of GCS computations, the total valid number of NPMUs per GCU 60 (VPMU) can be used in a garbage collection policy. For example, the VPMU can be considered as the summation of all NPMU values from the map 21 entries that have their PMUs 32 within a current GCU 60. Each time an LMUA 26 is written to the GCU 60-2, the NPMU of its map entry will be added to the VPMU of the GCU 60-2, while the NPMU of its old map entry will be subtracted from the VPMU of the GCU 60-1, if the old data still resides in the GCU 60-1. When garbage collection starts, the controller 20 will pick the GCU 60 with the smallest VPMU value for recycling. This approach can be viewed as a compressed data length represented at a granularity of the PMUAs 25.

With the mapping structure 41, the LDATA field is retained but the NPMU field is omitted. To accomplish host read I/O requests, the controller 20 calculates an estimate of the NPMU (NPMU-EST) on the fly as follows:

NPMU-EST=LDATA/SPMU-DATA,

where SPMU-DATA is the smallest number of data bytes stored in one PMU 32, excluding the size of the headers 31, parity bytes from error correction coding, etc. The controller 20 rounds up the division of LDATA/SPMU-DATA to the closest integer. Hence, NPMU-EST >= NPMU, which ensures that all of the host 11 data 12 of one LMUA 26 is read out from the storage device 30 upon a host read I/O request. For example, if LDATA = 3096 and SPMU-DATA = 3000, then NPMU-EST = 2.
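
A one-function C sketch of this rounded-up division follows; the function name is an assumption made for this example.

#include <stdint.h>

/* NPMU-EST = ceil(LDATA / SPMU-DATA). Rounding up guarantees that
 * NPMU-EST >= NPMU so that all of the data 12 of the LMUA 26 is read out.
 * For example, ldata = 3096 and spmu_data = 3000 give 2. */
uint32_t npmu_est(uint32_t ldata, uint32_t spmu_data)
{
    return (ldata + spmu_data - 1) / spmu_data;  /* integer ceiling division */
}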

With the new mapping structures 40 and 41, data deduplication for the storage system 10 is also presented. For example, the mapping structures 40 and 41 are responsible for translating logical block addresses from the host system 11 to physical block addresses of the storage device 30. How data is written to the storage device 30 and the dependence on specific components in a datapath are generally irrelevant to the mapping structures 40 and 41. But both compression and deduplication are still part of the datapath and may result in different entry values and/or the addition (or deletion) of certain fields in the map 21 when compared to a datapath without compression and deduplication. So, a datapath with deduplication may reuse the mapping structures 40 and 41 and/or their variants, such as the mapping structures 42 and 43 in FIGS. 12 and 13.

In this embodiment, deduplication removes data 12 that is a replica of previous data 12. For example, in FIG. 8, the host data of the LMUA 26-31 is the data 12-31, while the host data of the LMUA 26-33 is the data 12-33 and contains the same content as the data 12-31. The data 12-33 is a duplicate of the data 12-31 and can be removed to save storage space in the storage device 30. In this case, the LMUA 26-33 is associated with the data 12-31 so that the correct data 12 can be retrieved for the host system 11 upon a read request. In these embodiments, such an association is established by allowing the LMUA 26-33's header 31-33 to point to the data 12-31, similar to the header 31-31, leading to multiple headers 31 pointing to the same data 12.

In general, deduplication does not change the size of the host 11's data 12. That is, each piece of data 12 in a PMU 32 has a fixed size SLMU. Thus, the headers 31 for these bodies of data 12 can be implemented without a length field, and their corresponding entries in the map 21 can go without an LDATA field and thus use the mapping structure 40. An example of such is shown in FIG. 8 with the header 31-31, where only the LMUA 26-31 and its offset are recorded for the data 12-31 in the GCU 60-3, which is also pointed to by the LMUA 26-33, stored in the header 31-33 together with its offset. Depending on the size of the PMU 32, the data 12 of one or more LMUAs 26 (e.g., the LMUAs 26-31 through 26-33) may be stored in one PMU 32 (e.g., in the PMU 32-31) along with their corresponding headers 31 (e.g., the headers 31-31 through 31-33).

FIG. 9 illustrates smaller PMUs 32 that each accommodate one fixed size piece of data 12 but may be pointed to by multiple headers 31, such as the data 12-41 pointed to by the header 31-41, the header 31-43, and the header 31-44. In this case, no offset field is necessary in the headers 31 because the data 12 can be stored with a fixed offset, such as always from the beginning of each PMU 32. As such, only the LMUA 26 field is recorded in the header 31. For example, in the PMU 32-41, only the LMUA 26-41 is recorded in the header 31-41.

In FIG. 9, each PMU 32 of the GCU 60-4 has at most three headers (e.g., the headers 31-41, 31-43, and 31-44 at the PMU 32-41) due to the limited size of the PMU 32. Generally, the controller 20 determines the difference between the size of one PMU 32 and the size of one LMU 26, and then divides it by the size of one header 31 to get the maximum number of headers 31. However, it is usually good practice to set the number of headers 31 to a smaller number. The reason to limit the number of headers per PMU 32 is that some spare bits should be reserved for logging, error correction parity checks, and the like, as indicated by the “Other” field in FIG. 9. As a result, deduplication efficiency can be reduced.

For example, with three headers 31 being used for three LMUAs, there are three LMUAs sharing the same data instead of three different pieces of data, thereby saving two data pieces' worth of space in the storage device 30. In this embodiment, three headers 31 are used as the maximum number of headers 31 allowed by the size of a PMU 32, such that one, two, or even three headers 31 may appear in one PMU 32. The reason that two headers 31 may appear is that there may be only two LMUs having the same data, while a third LMU has different data (e.g., no longer a duplicate), according to host data patterns, which are out of the control of the storage device 30.

While any number of headers 31 may be used, a larger number may save more space. For example, if a maximum of five headers 31 are allowed in a PMU 32 and five duplicates of the same data 12-41 are seen, then the controller 20 may add the headers 31-45 and 31-47 into the PMU 32-41 as well, saving the entire PMU 32-43 to be used for storing other host data.

With a maximum of three headers 31 in one PMU 32, one header 31-42 appears in the PMU 32-42, and two headers 31-45 and 31-47 appear in the PMU 32-43. The data 12-41 is stored twice, at the PMU 32-41 and the PMU 32-43. This is due to the fact that the headers 31-45 and 31-47 cannot be stored at the PMU 32-41, which already stores three headers 31.

To improve deduplication efficiency in this case, incoming host write I/O requests may be reordered such that duplicates of the data 12 can be grouped together and written into the same PMU 32. To obtain the same amount of data storage as shown in FIG. 9, the host 11's write I/O request sequence can be either in the same order as the data storage (illustrated in FIG. 10) or in a different order (illustrated in FIG. 11) that is later reordered so that LMUAs 26 with the same data 12 are grouped together. For example, in FIG. 11, the LMUA 26-43 with the header 31-43 and the LMUA 26-44 with the header 31-44 are originally behind the LMUA 26-42 with the header 31-42. The LMUA 26-43 and the LMUA 26-44 may be reordered to be ahead of the LMUA 26-42 so that they can have their headers 31-43 and 31-44 written in the PMU 32-41 together with the header 31-41 such that all three point to the piece of data 12-41, thereby achieving deduplication.

With the new mapping structures 42 and 43 of FIGS. 12 and 13, respectively, two mapping tables for data deduplication are introduced. FIG. 12 illustrates a deduplication mapping using the mapping structure 42. In this embodiment, each LMUA 26 map entry contains a PMUA 25 and an RDEDUP field 75, where the RDEDUP is the number of headers 31 pointing to the LMUA 26's data 12. For example, the data 12-31 at the PMU 32-31 has two headers 31-31 and 31-33 pointing to it, as shown in FIG. 8. Thus, the value of the RDEDUP field at the PMUA 25-31 is “2” in the map entry. This indicates that each LMUA 26 has an average length of LDATA = SLMU/RDEDUP, which can be used for a GCS update during a host 11 write I/O.

Alternatively, the controller 20 can record this calculated LDATA in the map entry in place of the RDEDUP and use the mapping structure 41 in FIG. 4. For example, in FIG. 8, the RDEDUP is “2” for the PMUA 25-31. If SLMU = 4 kB, then LDATA = 4 kB/2 = 2 kB, and this value will replace the RDEDUP of “2” in the field 75 of FIG. 12 for the entry of the LMUA 26-31 and the PMUA 25-31. The GCS of the GCU 60 where the PMUA 25-31 resides will then be increased by 2 kB. Again, garbage collection chooses the GCU 60 with the smallest GCS to recycle, or uses the GCS in combination with other performance metrics.

In FIG. 13, the deduplication mapping uses the mapping structure 43. In this embodiment, each map entry contains one PMUA 25 field. This is possible because the data 12 written to the PMUs 32 of FIG. 9 have a fixed size and are each within one PMU 32. In this embodiment, the GCS can take the form of the TLMU-VLMU pair discussed above or of an RCMP-VLMU pair to accomplish garbage collection. The RCMP is calculated the same way as discussed above. For example, FIG. 9 shows a TLMU = 6 and a TPMU = 2. Assuming that SLMU = SPMU = 4 kB, then RCMP = (6*4 kB)/(2*4 kB) = 3. This indicates that each LMUA 26 has an LCMPD = SLMU/RCMP = 4 kB/3 = 1.33 kB. Then the controller 20 reduces the valid number of bytes (GCS) in the GCU 60 by 1.33 kB if the LMUA 26-41 updates its data 12 to be different from the current data 12-41.

Generally, this is to calculate the LCMPD (e.g., the average length of compressed or deduplicated data). If a latter one of the three LMUAs has its data updated to be different from the current data 12-41, part of the data 12-41 will become invalid. The GCS in this case is reduced by 1.33 kB (e.g., 1/3 of the true length of the data 12-41), instead of 4 kB, the true length of the data 12-41. This is because two other LMUAs still have the data 12-41 as their valid data.

The embodiments herein can take the form of hardware, firmware, software, or a combination thereof. FIGS. 14 and 15 illustrate such a combination that may be operable to employ the systems and methods described herein. More specifically, FIG. 14 is a block diagram of an exemplary storage system 10 and its associated device controller (e.g., the controller 20) and storage device 30. FIG. 15 is a block diagram of a storage module 216 comprising storage systems 10 and their associated controllers 20/storage devices 30 interfacing with a host system 202.

In FIG. 14, the controller 20 includes a host interface 111 that is operable to interface with a host system to communicate I/O operations of the host system. The host interface 111 may be configured with a tag tracking module 113 that is operable to track progress of individual I/O commands (e.g., read and write commands to certain addresses in the storage device 30). The tag tracking module 113 may associate an external flag of a command received from the host system with an internal flag that the controller 20 can access during processing of the command to identify the status of the processing.

The controller 20 also includes a data processing module 121 that comprises a processing engine 123 generally operable to perform certain tasks on data that is received from the host interface 111 or residing within a buffer 131, such as one or more of formatting the data, transcoding the data, compressing the data, decompressing the data, encrypting the data, decrypting the data, data deduplication, data encoding/formatting, or any combination thereof. For example, a processing engine 123 of the data processing module 121 may be operable to process the I/O operation from an I/O module of the host system generating the operation, such that the data of the I/O operation may be written to the physical address of the storage device 30. The processing engine 123 may extract the data of the write I/O command and prepare it for storage in the storage device 30. In doing so, the processing engine 123 may compress the data using any of a variety of data compression algorithms. When retrieving the data from the storage device 30, the processing engine 123 may decompress the data according to the algorithm used to compress the data for storage.

The buffer 131 is operable to store data transferred to and from the host system 11. The buffer 131 may also store system data, such as memory tables used by the controller 20 to manage the storage device 30, and any possible higher-level RAID functionality in the memory 137. Other modules may include an error correcting code (ECC-X) module 135 to provide higher-level error correction and redundancy functionality, and a Direct Memory Access (DMA) module 133 to control movement of data to and from the buffer 131.

The controller 20 also includes an error correction code module 161 operable to provide lower-level error correction and redundancy processing of the data in the buffer 131 using any of a variety of error correction code techniques (e.g., cyclic redundancy checks, Hamming codes, low-density parity check codes, etc.).

A device interface logic module 191 is operable to transfer data to and from the storage device 30 according to the protocol of the devices therein. The device interface logic module 191 includes a scheduling module 193 that is operable to queue I/O operations to the storage device 30.

The controller 20 herein also includes a map module 141 that is operable to perform data addressing to locations in the storage device 30 according to the map 21. For example, the map module 141 may use the map 21 to convert logical block addresses (LBAs) from the host system 11 to block/page addresses directed to the storage device 30. The map 21 may be stored in whole or in part in the controller 20 and/or in the storage device 30. For example, in some embodiments, a portion of the map 21 may be cached in RAM of the controller 20.

A recycler 151 performs garbage collection on behalf of the controller 20. For example, the recycler 151 may determine portions of the storage device 30 that are actively in use by scanning the map 21 of the map module 141. In this regard, the recycler 151 may make unused, or “deallocated”, portions of the storage device 30 available for writing by erasing the unused portions. The recycler 151 may also move data within the storage device 30 to make larger contiguous portions of the storage device 30 available for writing.

The controller 20 also includes a CPU 171 that controls various aspects of the controller 20. For example, the CPU 171 may process instructions or firmware to implement command management 173 that tracks and controls commands received from the host system. This firmware may also implement buffer management 175 that controls allocation and use of the buffer 131, and translation management 177 to control the map module 141. The firmware may also employ coherency management 179 to control consistency of data addressing to avoid conflicts such as those that may occur between external data accesses and recycled data accesses. The firmware may also provide device management 181 to control the device interface logic module 191, and identity management 182 to control modification and communication of identity information of components within the controller 20.

In FIG. 15, the host system 11 is operable to process software instructions and perform I/O operations with the storage module 216 to read from and write to one or more storage systems 10. In this regard, the host system 11 may include an operating system 205 that provides the computing environment for the host system 11. A driver 207 is operable to communicate through the link 206 to the storage module 216 to perform the I/O operations with the various storage systems 10 configured therewith.

Like other computing systems, the operating system 205 may be initiated via management software 214 (e.g., BIOS software). The host system 11 may also include application software 209 to perform various computing processes on behalf of the host system 11 (e.g., word processing applications, image processing applications, etc.). The host system 11 may also include I/O and storage functionality 217 operable to conduct I/O operations with one or more servers 218 through a communication network 219 (e.g., the Internet, local area networks, wide area networks, etc.). In this regard, the storage module 216 may act as a cache memory of I/O operations for the host system 11.

The storage module 216 may be configured with an intermediate controller 203 that is operable to switch various I/O operations of the host system 11 to LBAs of the storage systems 10. In this regard, the storage module 216 may include a memory 212 that stores mapping information for the intermediate controller 203 to conduct the I/O operations to the LBAs. The map module 141 of the controller 20 may also be operable to perform data addressing with variable-sized mapping units to locations in the storage device 30 according to the map 21, and to convert LBAs from the host system 11 to block/page addresses directed to the storage device 30.

It should be noted that the embodiments disclosed herein are not limited to any type of storage device 30, as they may be implemented in other persistent storage devices, including HDDs, SSDs, magnetoresistive storage devices, or the like.

What is claimed is:
1. A method, comprising: processing an Input/Output (I/O) request to a storage device; extracting a logical mapping unit from the I/O request; determining that the I/O request is for variable length data; accessing a map that links the logical mapping unit (LMU) to one or more physical addresses of the storage device; and calculating a number of physical mapping units at the physical addresses to service the I/O request.
2. The method of claim 1, wherein: the I/O request is a read I/O request.
3. The method of claim 1, wherein calculating the number of physical mapping units at the physical addresses comprises: extracting a length of the variable length data from the map; and dividing the length by a number of data bytes stored in one physical mapping unit.
4. The method of claim 1, wherein: the storage device is a NAND flash storage device, phase-change memory, Random Access Memory (RAM), resistive RAM, a magnetoresistive storage device, a magnetic recording medium, or a combination thereof.
5. The method of claim 1, further comprising: processing another I/O request to the storage device; extracting another logical mapping unit from the other I/O request; determining that the other I/O request is for fixed length data; accessing a mapping structure in the map that links the other logical mapping unit to one physical address of the storage device, wherein the mapping structure omits a length of the data to service the I/O request.
6. The method of claim 1, further comprising: determining that the data is a duplicate of previously written data in the storage device; reordering the data to improve deduplication efficiency; and generating a header associated with the LMU to point to the previously written data, the previously written data having another header associated with another LMU.
7. A storage system, comprising: a storage device; and a controller operable to process an Input/Output (I/O) request to the storage device, to extract a logical mapping unit from the I/O request, to determine that the I/O request is for variable length data, to access a map that links the logical mapping unit to one or more physical addresses of the storage device, and to calculate a number of physical mapping units at the physical addresses to service the I/O request.
8. The storage system of claim 7, wherein: the I/O request is a read I/O request.
9. The storage system of claim 7, wherein: the controller is further operable to calculate a number of physical mapping units at the physical addresses by: extracting a length of the variable length data from the map; and dividing the length by a number of data bytes stored in one physical mapping unit.
10. The storage system of claim 7, wherein: the storage device is a NAND flash storage device, phase-change memory, Random Access Memory (RAM), resistive RAM, a magnetoresistive storage device, a magnetic recording medium, or a combination thereof.
11. The storage system of claim 7, wherein: the controller is further operable to process another I/O request to the storage device, to extract another logical mapping unit from the other I/O request, to determine that the other I/O request is for fixed length data, and to access a mapping structure in the map that links the other logical mapping unit to one physical address of the storage device, wherein the mapping structure omits a length of the data to service the I/O request.
12. The storage system of claim 7, wherein: the controller is further operable to determine that the data is a duplicate of previously written data in the storage device, and to generate a header associated with the LMU to point to the previously written data, the previously written data having another header associated with another LMU.
13. A method of garbage collection in a storage device, the method comprising: upon a write of data to a first garbage collection unit (GCU): incrementing a number of logical mapping units stored in the first GCU; incrementing a number of logical mapping units with valid data stored in the first GCU; decrementing a number of logical mapping units with invalid data stored in a second GCU based on the incremented number of logical mapping units with valid data stored in the first GCU; and erasing the second GCU when a valid data rate of the second GCU is below a valid data rate of the first GCU.
14. The method of claim 13, wherein erasing the second GCU comprises: erasing the second GCU when a valid number of logical mapping units of the second GCU is less than a valid number of logical mapping units of the first GCU.
15. The method of claim 13, further comprising: computing an average compression rate or a deduplication rate of the data written to the second GCU.
16. The method of claim 15, wherein computing comprises: computing a compressed length of the data written to the second GCU based on the average compression rate or a deduplication rate of the data written to the second GCU and a size of a logical mapping unit stored in the second GCU.
17. The method of claim 16, wherein computing a compressed length of the data written to the second GCU comprises: computing the size of a logical mapping unit stored in the second GCU and dividing the size by the average compression rate or a deduplication rate.
18. The method of claim 15, further comprising: computing a size of valid data bytes of the second GCU based on the number of logical mapping units with valid data stored in the second GCU, the size of a logical mapping unit stored in the second GCU, and the average compression rate.
19. The method of claim 13, further comprising: computing the valid data rate of the second GCU based on a number of logical mapping units with valid data stored in the second GCU and a total number of logical mapping units stored in the second GCU.
20. A method of garbage collection in a storage device, the method comprising: upon a write of data to a first garbage collection unit (GCU): incrementing a number of logical mapping units stored in the first GCU; incrementing a number of logical mapping units with valid data stored in the first GCU; decrementing a number of logical mapping units with invalid data stored in a second GCU based on the incremented number of logical mapping units with valid data stored in the first GCU; and erasing the second GCU when the number of logical mapping units with invalid data stored in the second GCU reaches a total number of physical mapping units of the second GCU.